Depression Detection & Emotion Classification via Data-Driven Glottal Waveforms

Author

  • David Vandyke
Abstract

This doctoral consortium paper outlines the author's proposed investigation into the use of the voice-source waveform for affective computing. A data-driven glottal waveform representation, previously examined in the author's earlier doctoral studies for its speaker-discriminative abilities, is proposed for study in both depression detection and emotion recognition, including severity classification in the case of depression. 'Data-driven' refers to a parameterisation that focuses on the small but consistent idiosyncrasies of the glottal wave rather than only on its mean shape and ratio measures. A review of the literature is given, covering existing studies of the glottal waveform for depression detection and emotion classification. The benefits of developing easily accessible automatic recognition systems are stressed. The value of developing objective tools for clinicians in diagnosing depression is also conveyed. Finally, research questions are framed and experimental methodologies for addressing them are discussed. The studies proposed here will expand the body of knowledge regarding the information content of the glottal waveform and aim to improve depression detection and emotion classification accuracies based on the voice-source alone.

Keywords: Automatic Depression Recognition, Emotion Classification, Voice-Source Waveform, Glottal Waveform, Affect Classification

Similar resources

Classification-Based Detection of Glottal Closure Instants from Speech Signals

In this paper a classification-based method for the automatic detection of glottal closure instants (GCIs) from the speech signal is proposed. Peaks in the speech waveforms are taken as candidates for GCI placements. A classification framework is used to train a classification model and to classify whether or not a peak corresponds to the GCI. We show that the detection accuracy in terms of F1 ...

Glottal Waveforms for Speaker Inference & A Regression Score Post-Processing Method Applicable to General Classification Problems

Contributions are made along two main lines. Firstly, a method is proposed for using a regression model to learn relationships within the scores of a machine learning classifier, which can then be applied to future classifier output to improve recognition accuracy. The method is termed r-norm, and strong empirical results are obtained from its application to several text-indepen...

Particle Swarm Optimization Based Feature Enhancement and Feature Selection for Improved Emotion Recognition in Speech and Glottal Signals

In recent years, many studies have used speech-related features for speech emotion recognition; however, recent studies show that there is a strong correlation between emotional states and glottal features. In this work, Mel-frequency cepstral coefficients (MFCCs), linear predictive cepstral coefficients (LPCCs), perceptual linear predictive (PLP) features, gammatone f...

Methods for estimation of glottal pulses waveforms exciting voiced speech

Nowadays, the most popular speech processing techniques are recognition of all kinds (speech, speaker, and speaker-state recognition) and text-to-speech synthesis. Glottal pulse waveforms can be used in both of these domains. In recognition, they can be used to describe the vocal cords, and that description can then be used for the classification of speak...

Generative Adversarial Network-Based Glottal Waveform Model for Statistical Parametric Speech Synthesis

Recent studies have shown that text-to-speech synthesis quality can be improved by using glottal vocoding. This refers to vocoders that parameterize speech into two parts, the glottal excitation and the vocal tract, which correspond to components of the human speech production apparatus. Current glottal vocoders generate the glottal excitation waveform by using deep neural networks (DNNs). However, the squared error-b...

Publication date: 2013